Telco Customer Churn

In this article, we analyze the Telco Customer Churn dataset and build a model to predict customer churn.

Dataset

| Column | Description |
|---|---|
| customerID | Customer ID |
| gender | Whether the customer is male or female (Male, Female) |
| SeniorCitizen | Whether the customer is a senior citizen or not (1, 0) |
| Partner | Whether the customer has a partner or not (Yes, No) |
| Dependents | Whether the customer has dependents or not (Yes, No) |
| tenure | Number of months the customer has stayed with the company |
| PhoneService | Whether the customer has a phone service or not (Yes, No) |
| MultipleLines | Whether the customer has multiple lines or not (Yes, No, No phone service) |
| InternetService | Customer’s internet service provider (DSL, Fiber optic, No) |
| OnlineSecurity | Whether the customer has online security or not (Yes, No, No internet service) |
| OnlineBackup | Whether the customer has an online backup or not (Yes, No, No internet service) |
| DeviceProtection | Whether the customer has device protection or not (Yes, No, No internet service) |
| TechSupport | Whether the customer has tech support or not (Yes, No, No internet service) |
| StreamingTV | Whether the customer has streaming TV or not (Yes, No, No internet service) |
| StreamingMovies | Whether the customer has streaming movies or not (Yes, No, No internet service) |
| Contract | The contract term of the customer (Month-to-month, One year, Two year) |
| PaperlessBilling | Whether the customer has paperless billing or not (Yes, No) |
| PaymentMethod | The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic)) |
| MonthlyCharges | The amount charged to the customer monthly |
| TotalCharges | The total amount charged to the customer |
| Churn | Whether the customer churned or not (Yes, No) |

Loading the Dataset
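
A minimal sketch of the loading step, assuming the data is read with pandas. The helper name `load_telco` and the inline sample rows are hypothetical and stand in for the real Kaggle CSV. In the Kaggle file, TotalCharges contains a few blank strings, so the sketch coerces that column to numeric values.

```python
import io

import pandas as pd


def load_telco(source):
    """Read the Telco churn CSV from a path or file-like object and tidy it."""
    df = pd.read_csv(source)
    # Blank strings in TotalCharges become NaN instead of staying as text
    df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
    # Encode the target as 0 (No) and 1 (Yes)
    df["Churn"] = df["Churn"].map({"No": 0, "Yes": 1})
    return df


# Hypothetical rows standing in for the real file; pass a file path instead
sample = io.StringIO(
    "customerID,tenure,MonthlyCharges,TotalCharges,Churn\n"
    "0001-A,1,29.85,29.85,No\n"
    "0002-B,0,52.55, ,No\n"
    "0003-C,2,53.85,108.15,Yes\n"
)
df = load_telco(sample)
print(df.dtypes)
print(df.isna().sum())
```

Rows with a missing TotalCharges can then be dropped or imputed, depending on how they should be treated downstream.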

Variance of the Features

Let's take a look at the variance of the features.

Furthermore, we standardize the features by removing the mean and scaling to unit variance using StandardScaler(). This puts features with very different ranges, such as tenure and TotalCharges, on a comparable scale.
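
The effect of standardization can be sketched as follows. The three columns are synthetic stand-ins (an assumption, not the real data), chosen only so that their variances differ by orders of magnitude.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins for three numeric columns with very different scales
X = pd.DataFrame({
    "tenure": rng.integers(0, 73, size=500).astype(float),
    "MonthlyCharges": rng.uniform(18, 120, size=500),
    "TotalCharges": rng.uniform(18, 8700, size=500),
})

# Before scaling: variances differ by orders of magnitude
print(X.var())

scaler = StandardScaler()
X_std = pd.DataFrame(scaler.fit_transform(X), columns=X.columns)

# After scaling: zero mean and unit variance
# (StandardScaler uses the population variance, hence ddof=0)
print(X_std.mean().round(6))
print(X_std.var(ddof=0).round(6))
```

When the scaled features feed a model, it is usually safer to fit the scaler on the training portion only and then apply it to the test portion.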

Saving to a CSV
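
A minimal sketch of the save step, assuming pandas. The frame contents and the file name `telco_clean.csv` are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Hypothetical processed frame standing in for the cleaned dataset
df = pd.DataFrame({
    "tenure": [1, 34],
    "MonthlyCharges": [29.85, 56.95],
    "Churn": [0, 1],
})

# Hypothetical output location; index=False keeps the row index out of the file
out_path = os.path.join(tempfile.mkdtemp(), "telco_clean.csv")
df.to_csv(out_path, index=False)

# Reading the file back should reproduce the same table
restored = pd.read_csv(out_path)
print(restored.equals(df))
```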

Train and Test Sets

First, consider the distribution of the target variable, Churn.
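
The class distribution of the Churn column can be inspected with `value_counts`. The sketch below rebuilds the column from the class counts commonly reported for the Kaggle file (5,174 No and 1,869 Yes); treat these counts as an assumption and check them against your own copy of the data.

```python
import pandas as pd

# Assumed class counts for the Kaggle file; replace with df["Churn"] in practice
churn = pd.Series(["No"] * 5174 + ["Yes"] * 1869, name="Churn")

counts = churn.value_counts()
shares = churn.value_counts(normalize=True)

print(pd.DataFrame({"count": counts, "share": shares.round(3)}))
```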

StratifiedKFold is a variation of k-fold which returns stratified folds: each set contains approximately the same percentage of samples of each target class as the complete set.
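
A small sketch of how stratification preserves the class ratio. The labels are synthetic (73 negatives and 27 positives, an assumption for illustration), and the feature matrix is a placeholder because only the labels matter for the split.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic labels: 73 negatives and 27 positives
y = np.array([0] * 73 + [1] * 27)
X = np.zeros((len(y), 1))  # placeholder features

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

test_rates = []
for train_idx, test_idx in skf.split(X, y):
    train_rate = y[train_idx].mean()
    test_rate = y[test_idx].mean()
    test_rates.append(test_rate)
    print(f"train positive rate: {train_rate:.2f}, test positive rate: {test_rate:.2f}")
```

Every fold keeps a positive rate close to the overall 27 percent, which a plain k-fold split does not guarantee.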

Modeling: Decision Tree Classifier with Voting Classifier

Decision Tree Classifier uses a decision tree (as a predictive model) to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves).

Now, we can develop a model using the best parameters for the DTC. Here, we use 10 stratified randomized folds made by preserving the percentage of samples for each class.
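
One way this step could look, as a sketch rather than the exact procedure used here: the data is synthetic (`make_classification`), the parameter grid is hypothetical, and StratifiedShuffleSplit is assumed as the source of the 10 stratified randomized folds.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the prepared features and target
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.73, 0.27], random_state=0
)

# 10 stratified randomized folds that preserve the class percentages
cv = StratifiedShuffleSplit(n_splits=10, test_size=0.3, random_state=0)

# Hypothetical search grid for the decision tree
param_grid = {"max_depth": [2, 4, 6], "min_samples_leaf": [1, 10]}
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=cv,
    scoring="balanced_accuracy",
)
search.fit(X, y)
best = search.best_params_
print(best)

# Refit with the best parameters on each fold and collect test accuracy
scores = []
for train_idx, test_idx in cv.split(X, y):
    clf = DecisionTreeClassifier(random_state=0, **best)
    clf.fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))

print(f"accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

Scores measured on the same folds used for the parameter search tend to be optimistic, so a separate hold-out set gives a fairer estimate.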

Similarly, for the confusion matrix, we take the average of the confusion matrices over all folds.
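
Averaging the per-fold confusion matrices can be sketched as below. The data is synthetic and the tree depth is a hypothetical choice, not a tuned value.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the prepared features and target
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.73, 0.27], random_state=0
)
cv = StratifiedShuffleSplit(n_splits=10, test_size=0.3, random_state=0)

matrices = []
for train_idx, test_idx in cv.split(X, y):
    clf = DecisionTreeClassifier(max_depth=4, random_state=0)  # hypothetical depth
    clf.fit(X[train_idx], y[train_idx])
    y_pred = clf.predict(X[test_idx])
    # labels=[0, 1] fixes the layout: rows are true classes, columns are predictions
    matrices.append(confusion_matrix(y[test_idx], y_pred, labels=[0, 1]))

# Element-wise mean over the folds
mean_cm = np.mean(matrices, axis=0)
print(mean_cm)
```

Because every fold has the same test size, the entries of the averaged matrix sum to that test size.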

Note that:

\begin{align} \text{Accuracy} = \frac{T_p + T_n}{T_p + T_n + F_p + F_n} \end{align} \begin{align} \text{Precision} = \frac{T_p}{T_p+F_p} \end{align} \begin{align} \text{Recall} = \frac{T_p}{T_p + F_n} \end{align}

where $T_p$, $T_n$, $F_p$, and $F_n$ represent true positive, true negative, false positive, and false negative, respectively.

However, accuracy can be a misleading metric for imbalanced data sets. Here, about 73 percent of the samples are negative (No) and about 27 percent are positive (Yes). In such cases, the balanced accuracy (bACC) [4] is recommended: it normalizes true positive and true negative predictions by the number of positive and negative samples, respectively, and divides their sum by two:

\begin{align} \text{TPR} &= \frac{T_p}{T_p + F_n},\\ \text{TNR} &= \frac{T_n}{T_n + F_p},\\ \text{Balanced Accuracy (bACC)} &= \frac{1}{2}\left(\text{TPR}+\text{TNR}\right) \end{align}
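
To see why accuracy can mislead, consider a toy example (synthetic labels, not the churn data) in which a classifier always predicts the majority class. The sketch computes the balanced accuracy by hand, with TNR taken as $T_n/(T_n + F_p)$, and checks it against scikit-learn's `balanced_accuracy_score`.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score, confusion_matrix

# Toy labels: 8 negatives and 2 positives; the classifier always predicts "No"
y_true = np.array([0] * 8 + [1] * 2)
y_pred = np.zeros(10, dtype=int)

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)  # 0.8, looks good
tpr = tp / (tp + fn)                        # 0.0, no churner is found
tnr = tn / (tn + fp)                        # 1.0
bacc = 0.5 * (tpr + tnr)                    # 0.5, no better than chance

print(f"accuracy={accuracy:.2f}, TPR={tpr:.2f}, TNR={tnr:.2f}, bACC={bacc:.2f}")

# scikit-learn gives the same balanced accuracy
print(balanced_accuracy_score(y_true, y_pred))
```

The accuracy of 0.8 hides the fact that no positive sample is detected, while the balanced accuracy of 0.5 makes it visible.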

Conclusions

To improve the results, we could try a deep learning model trained with an iterative optimization method.


References

  1. Kaggle Telco Customer Churn data
  2. scikit-learn classifiers
  3. Precision and recall, Wikipedia
  4. Mower, Jeffrey P. "PREP-Mt: predictive RNA editor for plant mitochondrial genes." BMC bioinformatics 6.1 (2005): 1-15.